TDB-ACC-NO: NN86034506

DISCLOSURE TITLE: Register Allocation
PUBLICATION-DATA:

IBM Technical Disclosure Bulletin, March 1986, US 

VOLUME NUMBER: 28

ISSUE NUMBER: 10 

PAGE NUMBER: 4506 - 4513 

DISCLOSURE TEXT: 

- An algorithm is disclosed that allocates machine registers for 
IBM System/370 architecture. The above figure depicts a flow chart 
for an improved register allocator, the major task of which is to 
assign an unlimited number of "symbolic registers" (SR) to the finite register
set on the hardware. Following is a summary of each stage in the
process: 1. Right Number of Names (RNN). An SR may have several
disjoint lifetimes, i.e., different sequences of instructions in 
which it can reside in distinct real registers. RNN calculates the 
lifetimes of all SRs, and separates the disjoint lifetimes of each 
SR into different "names". Each of these names is actually assigned 
to a register. 2. Reduce Register Pressure (RRP) and RX 
Rematerialization. The register pressure at an instruction is the 
number of names alive at that instruction. Since each name needs a 
real register, the register pressure at every point in the program 
must be less than or equal to the number of available hardware 
registers for register allocation to succeed. RRP reduces register
pressure at points where it is too high by "spilling", i.e., 
storing some live values into memory, and later reloading them. 
Earlier compiler phases express all operations in terms of 
register-to-register (RR) instructions; this is most efficient when 
an SR is used several times. However, if the SR is only used once,
it is often possible to employ a single memory-to-register (RX)
instruction to achieve the same effect as the two-instruction
sequence of Load followed by an RR instruction. RX Rematerialization converts
such sequences into the equivalent RX instruction where possible. 
3. Right Number of Names (RNN). RNN is run again after spilling
because the STORE/LOAD pairs break single lifetimes into several 
lifetimes, and RX rematerialization eliminates the need for some 
registers altogether, so RNN yields a result different from the 
first time. More important, it is easier to allocate registers for 
the resulting "names". 4. Build Interference Graph (BIG). Two names 
are said to interfere if their lifetimes overlap. The interference
relation is represented by the interference graph (IG), which is an
undirected graph with names as nodes, and edges connecting 
interfering names. A coloring of a graph is an assignment of 
integers (real registers) to nodes (names) such that adjacent nodes 
(interfering names) are assigned different integers (real 
registers). Register allocation is equivalent to coloring the
interference graph with real register colors. BIG constructs the 
interference graph by examining each instruction and adding edges 
between all names which are live at that point. The IG is also 
exploited to express some of the irregularities of the hardware. 
For instance, to prevent register zero (R0) from being used as a
base or index register, R0 is made to interfere with every name
appearing as a base or index in some instruction. This process is 
also accomplished during the scan of the program. 5. Coalesce. 
Earlier phases of the compiler insert (explicitly and implicitly) 
numerous load register instructions: LR Ra, Rb. If Ra and Rb are
mapped to the same real register, then the LR can be eliminated. 
The purpose of Coalesce is to cause as many LRs to be eliminated by 
this method as possible. Coalesce does this by scanning the program 
for LRs. If the names of the two symbolic registers do not 
interfere, the names are combined into a single name and their 
interferences are joined. In terms of the IG, one of the nodes is 
chosen as the representative, and the other's edges are added to 
it. This guarantees that Color will assign the original two names 
to the same register. 6. Color. This colors the IG (as modified by
Coalesce) using as many colors as there are real registers, to provide the actual
mapping of names to real registers. If the coloring attempt is 
unsuccessful, Fixup is invoked to fix those names that remain 
uncolored. 7. Fixup. Fixup is executed if any names remain 
uncolored after Color. Code is inserted to cause these names to be 
spilled everywhere except at the individual instructions where 
mentioned. 8. Spill Temp Allocation. Just as names must be assigned 
to real registers, so too must the spill temps be assigned to 
memory locations over their lifetimes. The problem is similar to 
register allocation. Spill temps having non-overlapping lifetimes
can share the same storage location. The major difference is that
there is an unlimited number of memory locations, instead of a
fixed set of registers. Spill temp allocation uses a modified 
version of BIG and Color for its processing. The structure of the 
improved register allocator is more efficient than that of the 
prior art. The improved register allocator makes fewer passes over 
the program; register pressure is always reduced before attempting 
to Color, since it is almost always necessary to spill something; 
the algorithm never loops - if there are remaining uncolored names, 
they are spilled by Fixup; and the IG data structure is represented 
in a space-efficient manner, at a minor decrease in time 
efficiency. The IG consists solely of the adjacency lists (halved) . 
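The BIG and Color stages described above can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: it assumes that "halved" means each edge is stored only once, in the list of the higher-numbered name, and it substitutes a simple greedy coloring for Color. The function names and the integer numbering of names are hypothetical.

```python
def build_halved_ig(live_sets):
    """Build adjacency lists in which edge {i, j} is stored once, under max(i, j).

    Assumption: this is what "halved" means. Duplicates are appended without
    checking, as the disclosure says BIG does.
    """
    adj = {}
    for live in live_sets:
        names = sorted(live)
        for pos, lo in enumerate(names):
            for hi in names[pos + 1:]:
                adj.setdefault(hi, []).append(lo)
    return adj


def interferes(adj, i, j):
    """Scan one halved list to decide whether names i and j interfere."""
    lo, hi = min(i, j), max(i, j)
    return lo in adj.get(hi, [])


def color(adj, names, num_regs):
    """Greedy stand-in for Color: returns (assignment, uncolored names)."""
    full = {n: set() for n in names}
    for hi, los in adj.items():
        for lo in los:
            full[hi].add(lo)
            full[lo].add(hi)
    assignment, uncolored = {}, []
    # Highest-degree names first; ties broken by name for determinism.
    for n in sorted(names, key=lambda n: (-len(full[n]), n)):
        used = {assignment[m] for m in full[n] if m in assignment}
        free = [r for r in range(num_regs) if r not in used]
        if free:
            assignment[n] = free[0]
        else:
            uncolored.append(n)  # the disclosure hands these to Fixup
    return assignment, uncolored


# Names live at three program points (hypothetical data).
live_sets = [{1, 2}, {2, 3}, {1, 2}]
adj = build_halved_ig(live_sets)
assignment, uncolored = color(adj, {1, 2, 3}, num_regs=2)
```

Names 1 and 3 never appear live together, so they can share a register even with only two colors available.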
There are two instances where the prior-art bit matrix is 
advantageous. 1. In BIG, duplicates are kept out of the adjacency
list by testing Matrix(i,j). 2. In Coalesce, for each LR i,j,
Matrix(i,j) is tested to see if i and j interfere. In the improved
allocator, the adjacency lists are used alone in these two 
situations, and are not inefficient because: 1. In BIG, duplicates 
are not forbidden. So BIG simply adds j to i's adjacency list 
without checking for a duplicate. This potentially uses too much 
space, so if a space limit is reached, BIG performs a "garbage 
collection". The process uses a single temporary bit vector,
initialized to all zeros before using on each list. Each adjacency 
list is scanned: if the corresponding bit is '1', the entry is a 
duplicate and is reclaimed; otherwise, the bit is set to '1'. In 
practice, few duplicates are generated, because edges are added at 
the definition point of a name, and most names have a single 
definition point. 2. During Coalesce, it is necessary to decide if 
i and j interfere. Thus it is necessary to traverse i's list 
looking for j, which might be inefficient if i's list is long. But
there are several reasons why this inefficiency does not arise: a.
Garbage collection is performed at the beginning of Coalesce, so
the lists are as short as possible. b. Since RRP is run before
Coalesce, register pressure has been reduced to a fixed maximum,
say, 14. This implies that most names interfere with fewer than 14
other names; i.e., the average number of interferences for a name
is independent of the total number of names. c. The lists are
halved, so the average list length is half of the number of 
adjacent elements, e.g., 6.5. d. If i and j do interfere, then on 
average j will be found after scanning only half of i's list, e.g., 
3.25. e. If i and j do not interfere, then i's entire list will be 
scanned. In this case, i and j will be coalesced, but the Coalesce 
action requires scanning i's list anyway. Thus, at worst, it costs 
twice as much, but the cost is not proportional to the length of 
i's list. The improved algorithm deals with 16 general registers 
and 4 floating point registers, whereas the prior-art algorithm 
only handles the general registers. The floating-point registers 
are independent of the general registers, i.e., there are no
instructions using registers from both register classes.
Therefore, the improved process identifies symbolic registers by 
class, and the register allocation process is applied to the two 
classes independently. For efficiency, the two classes are treated 
in parallel during a single execution of the steps in the figure. 
This permits fewer passes over the program. The improved algorithm 
deals with register pairs, whereas the prior-art process does not. 
Certain instructions on the IBM System/370 use even-odd pairs of 
general registers: a. Multiply: M, MR b. Divide: D, DR c. Move Long:
MVCL, and Compare Long: CLCL d. Branch on Index High: BXH, and
Branch on Index Low or Equal: BXLE. Register pairs are more
difficult to allocate optimally than floating-point registers, 
because the pairs overlap with the single registers. The mechanism 
is as follows: 1. The SRs are classified as single or pair in the 
intermediate language. 2. There are intermediate language 
instructions that convert singles to/from pairs: - Combine:
generates a pair from two singles. This operation normally expands 
into two Load Register (LR) hardware instructions, to move the two 
single registers into the corresponding members of the pair. The 
instructions in classes c and d above have pair registers as input, 
which are generated by Combine. - Extract_Odd, Extract_Even:
generates a single from a pair by extracting the odd or even member
of the pair, respectively. This operation normally expands into the
machine instruction Load Register (LR), to move the value from a
member of the pair to the single register. The instructions in 
classes a and b above create pair results from which single values 
are obtained using Extract. 3. RNN. Pairs are treated like singles. 
4. RRP. Register pressure is computed for the general registers 
counting 2 for pairs and 1 for singles. RX Rematerialization treats 
pairs like singles. 5. BIG. Pairs are treated like singles: pairs 
can interfere with singles, and vice versa. 6. Coalesce. There are
two unique processes of performing coalesce with pairs: A. In one 
process, pairs only participate in Coalesce when they appear in 
Load Pair Register instructions, which copy from one pair register 
to another. Pairs do not participate in any other way during 
Coalesce. B. The second and 
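
The duplicate-removing "garbage collection" pass that the disclosure describes for BIG can be sketched as below. This is an illustrative reading only: the function name is hypothetical, names are assumed to be small integers, and a bytearray (one byte per name) stands in for the single temporary bit vector.

```python
def garbage_collect(adj, num_names):
    """Remove duplicate entries from every adjacency list.

    One scratch vector is shared by all lists. It is all zeros when each
    list starts, because the bits set for a list are cleared afterwards.
    """
    seen = bytearray(num_names)  # stand-in for the temporary bit vector
    for name, neighbours in adj.items():
        kept = []
        for n in neighbours:
            if seen[n]:
                continue         # bit already '1': duplicate, reclaim it
            seen[n] = 1          # first occurrence: set the bit, keep entry
            kept.append(n)
        for n in kept:
            seen[n] = 0          # restore all-zeros state for the next list
        adj[name] = kept
    return adj


# Hypothetical adjacency lists containing duplicates.
lists = {2: [1, 1, 0], 3: [2, 2]}
garbage_collect(lists, num_names=4)
```

Clearing only the bits that were set has the same effect as re-initializing the whole vector before each list, while keeping the cost proportional to the list length.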

SECURITY: Use, copying and distribution of this data is subject to the restrictions in the Agreement
For IBM TDB Database and Related Computer Databases. Unpublished - all rights reserved under the 
Copyright Laws of the United States. Contains confidential commercial information of IBM exempt from 
FOIA disclosure per 5 U.S.C. 552(b)(4) and protected under the Trade Secrets Act, 18 U.S.C. 1905. 

COPYRIGHT STATEMENT: The text of this article is Copyrighted (c) IBM Corporation 1986. All rights 
reserved.
guided prefetching for irregular code



PUBLICATION-DATE: July 3, 2003



INVENTOR-INFORMATION:
NAME                 CITY        STATE   COUNTRY   RULE-47
Wu, Youfeng          Palo Alto   CA      US
Serrano, Mauricio    San Jose    CA      US

US-CL-CURRENT: 717/158; 717/1 S9
ABSTRACT:

A compiler technique uses profile feedback to determine stride values for memory
references, allowing prefetching of instructions for those loads that can be
effectively prefetched. The compiler first identifies a set of loads, and
instruments the loads to profile the difference between the successive load
addresses in the current iteration and in the previous iteration. The frequency of
stride difference is also profiled to allow the compiler to insert prefetching
instructions for loads with near-constant strides. The compiler employs code
analysis to determine the best prefetching distance, to reduce the profiling cost,
and to reduce the prefetching overhead. 
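
The stride profiling that this abstract describes might look roughly like the sketch below. It is an assumption-laden illustration rather than the patented method: the function names, the 0.8 "near-constant" threshold, and the address trace are all hypothetical.

```python
from collections import Counter


def profile_strides(addresses):
    """Count the differences between successive addresses of one load."""
    return Counter(b - a for a, b in zip(addresses, addresses[1:]))


def dominant_stride(addresses, min_fraction=0.8):
    """Return the stride if one non-zero value dominates, else None.

    min_fraction is a hypothetical threshold for "near-constant".
    """
    strides = profile_strides(addresses)
    total = sum(strides.values())
    if total == 0:
        return None
    stride, count = strides.most_common(1)[0]
    if stride != 0 and count / total >= min_fraction:
        return stride
    return None


def prefetch_address(address, stride, distance):
    """Address to prefetch `distance` iterations ahead of the current one."""
    return address + stride * distance


# Hypothetical trace: nine of the ten strides are 64 bytes.
trace = [1000, 1064, 1128, 1192, 1256, 1400, 1464, 1528, 1592, 1656, 1720]
stride = dominant_stride(trace)
irregular = dominant_stride([0, 7, 100, 3, 50])
```

A load whose trace yields None would be left without a prefetch, which matches the abstract's point that only loads with near-constant strides are worth prefetching.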
US-CL-CURRENT: 717/1 5ft
ABSTRACT:



A method for executing a code is provided. The method includes receiving a trigger 
instruction, selecting an entry in a trigger table, the entry associated with the 
trigger instruction, and executing an auxiliary code referenced by the entry in the 
trigger table. 
US-CL-CURRENT: 717/140
ABSTRACT:

The present invention is a method and system to support debug. A function is 
compiled. The function includes a byte code sequence having a field byte code that 
accesses or modifies a field. The compiled function provides a native code and 
occupies a code space. An instrumentation code corresponding to a field watch of a
field is generated. The instrumentation code is inserted to the native code.
INVENTOR-INFORMATION:
NAME               STATE   COUNTRY   RULE-47
Lueh, Guei-Yuan    CA      US

US-CL-CURRENT: 212/l2£
ABSTRACT:

The present invention is a method and system to support debug. A function is
re-compiled when a field watch for a field is activated. The function includes a
byte code sequence having a field byte code that accesses or modifies the field. The
re-compiled function provides a native code and occupies a code space. An
instrumentation code corresponding to the field watch of the field is generated. The
instrumentation code is inserted to the native code.






5. Document ID: US 20020083425 A1
L11: Entry 5 of 9    File: PGPB    Jun 27, 2002



PGPUB-DOCUMENT-NUMBER: 20020083425
PGPUB-FILING-TYPE: new

DOCUMENT-IDENTIFIER: US 20020083425 A1

TITLE: System and method for obtaining scratch registers in computer executable
binaries

PUBLICATION-DATE: June 27, 2002



INVENTOR-INFORMATION:
NAME                CITY          STATE   COUNTRY   RULE-47
Gillies, David M.   Bellevue      WA      US
Chaiken, Ronnie     Woodinville   WA      US
Liu, Jiyang         Sammamish     WA      US



US-CL-CURRENT: 212/l£fi
ABSTRACT:

A system and method for obtaining scratch registers in a computer-executable binary
is provided. Register allocation requests in a computer-executable binary are
discovered. In one method, the register allocations are examined
procedure-by-procedure. The maximum number of registers requested by any instruction
in the procedure is discovered. Then, register requests in the procedure are
modified to request the maximum number discovered plus a number of scratch
registers. In another method, the register allocations are examined block-by-block
within a procedure. Dominating register allocations for each block are found. Then
the dominating register allocations are modified to request scratch registers.
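
The procedure-by-procedure method in this abstract can be illustrated with a short sketch. It assumes each register allocation request is represented simply by the number of registers it asks for; the function name and that representation are hypothetical and not taken from the patent.

```python
def add_scratch_registers(requests, num_scratch):
    """Rewrite every request in one procedure to max(requests) + num_scratch.

    Returns the rewritten requests and the index of the first scratch
    register, which is the same throughout the procedure.
    """
    if not requests:
        return [], None
    max_requested = max(requests)          # linear search over the requests
    new_size = max_requested + num_scratch
    return [new_size] * len(requests), max_requested


# Hypothetical procedure with three allocation requests.
rewritten, first_scratch = add_scratch_registers([3, 5, 4], num_scratch=2)
```

Because every request is rewritten to the same size, inserted code can refer to the scratch registers by one fixed index anywhere in the procedure.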
TITLE: Emulation system that uses dynamic binary translation and permits the safe
speculation of trapping operations 

DATE-ISSUED: October 7, 2003
INVENTOR-INFORMATION:
Le; Bich-Cau San Jose CA

US-CL-CURRENT: 717/137; 717/136, 717/1 ST, 717/1 S9
ABSTRACT:

The inventive emulator dynamically translates instructions in code written for a 
first architecture into code for a second architecture. The emulator designates 
various checkpoints in the original code, and speculatively reorders the placement 
of the translated code instructions according to optimization procedures. If during 
the execution of the reordered code, a trap should occur, then the emulator resets 
the original code to the most recent checkpoint and begins executing the original 
code sequentially in a line-by-line manner until the section is completed or
branched out of. The original code is reset by changing the program counter to the
checkpoint, and reversing the effects of each instruction which has been executed 
subsequent to the checkpoint. Thus, any native instructions which correspond to 
original instructions which occur sequentially prior to the checkpoint have been 
executed, and any native instructions which correspond to original instructions 
which occur sequentially subsequent to the checkpoint have not been executed. 

52 Claims, 9 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 3 






7. Document ID: US 6625807 B1
L11: Entry 7 of 9 File: USPT Sep 23, 2003
TITLE: Apparatus and method for efficiently obtaining and utilizing register usage
information during software binary translation 

DATE-ISSUED: September 23, 2003 
INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY 

Chen; Ding-Kai San Jose CA 

US-CL-CURRENT: 717/154; 103./ 23., 212./ 222, 71 7/ 1 ?ft , 717/ 1 ^6
ABSTRACT:

An apparatus and method are described for register optimization during code
translation. The technique removes the time overhead for analyzing register
usage, and eliminates fixed restraints on the compiler register usage. The present
invention for register optimization utilizes a compiler to produce a bit vector for
each program unit (e.g., subroutine, function, and/or procedure). Each bit in the
bit vector represents a particular caller-saved register. A bit is set if the
compiler uses the corresponding register within that program unit. During the 
translation, the translator examines the bit vector to very quickly determine which 
registers are free, and therefore can be used during register optimization without 
having to save and restore the register values. 
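
The per-program-unit bit vector in this abstract can be sketched as follows. The register names and the bit order are purely hypothetical; the abstract says only that each bit stands for one caller-saved register and is set when the compiler uses that register.

```python
# Hypothetical caller-saved registers; bit i corresponds to CALLER_SAVED[i].
CALLER_SAVED = ["r8", "r9", "r10", "r11"]


def usage_bit_vector(used_registers):
    """Compiler side: set one bit per caller-saved register the unit uses."""
    bits = 0
    for index, reg in enumerate(CALLER_SAVED):
        if reg in used_registers:
            bits |= 1 << index
    return bits


def free_registers(bits):
    """Translator side: registers whose bit is clear need no save/restore."""
    return [reg for index, reg in enumerate(CALLER_SAVED)
            if not bits & (1 << index)]


vector = usage_bit_vector({"r9", "r11"})
available = free_registers(vector)
```

The translator needs only a few bit tests per program unit, which is the source of the speed advantage the abstract claims over analyzing register usage during translation.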

16 Claims, 12 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 10 
US-CL-CURRENT: 2A3./22S1; 709/250
ABSTRACT:

An intelligent network interface card (INIC) or communication processing device 
(CPD) works with a host computer for data communication. The device provides a 
fast-path that avoids protocol processing for most messages, greatly accelerating 
data transfer and offloading time-intensive processing tasks from the host CPU. The
host retains a fallback processing capability for messages that do not fit fast-path
criteria, with the device providing assistance such as validation even for slow-path 
messages, and messages being selected for either fast-path or slow-path processing. 
A context for a connection is defined that allows the device to move data, free of 
headers, directly to or from a destination or source in the host. The context can be 
passed back to the host for message processing by the host. The device contains 
specialized hardware circuits that are much faster at their specific tasks than a 
general purpose CPU. A preferred embodiment includes a trio of pipelined processors 
devoted to transmit, receive and utility processing, providing full duplex 
communication for four Fast Ethernet nodes.
TITLE: System and method for performing selective dynamic compilation using run-time
information

DATE-ISSUED: July 30, 2002

INVENTOR-INFORMATION:
NAME
US-CL-CURRENT: 112/ IAD.; 71 7 /14A, 212/153.
ABSTRACT:

Selective dynamic compilation of source code is performed using run-time 
information. A system is disclosed that implements a declarative, annotation based 
dynamic compilation of the source code, employing a partial evaluation, binding-time
analysis (BTA), and including program-point-specific polyvariant division and
specialization and dynamic versions of traditional global and peephole 
optimizations. The system allows programmers to declaratively specify policies that 
govern the aggressiveness of specialization and caching, providing fine control over 
the dynamic compilation process. The policies include directions for controlling 
specialization at promotion points and merge points, and further define caching 
policies, and speculative-specialization policies. The system also enables 
programmers to specialize programs across arbitrary edges, both at traditional 
locations, such as procedure boundaries, but also within procedures. Programmers are 
enabled to conditionally specialize programs based on evaluation of arbitrary 
compile-time and run-time conditions. 

20 Claims, 32 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 22 
1. Document ID: US 20020083425 A1
L9: Entry 1 of 1 File: DWPI Jun 27, 2002 

DERWENT-ACC-NO: 2002-590185
DERWENT-WEEK: 200263

COPYRIGHT 2003 DERWENT INFORMATION LTD 

TITLE: Computer-implemented method for instrumenting binaries,
involves modifying each register request from several register 
requests to request maximum number of discovered registers plus 
scratch registers 

INVENTOR: CHAIKEN, R; GILLIES, D M; LIU, J
PRIORITY-DATA: 2000US-0746949 (December 21, 2000)
PATENT-FAMILY:

PUB-NO               PUB-DATE        LANGUAGE   PAGES   MAIN-IPC
US 20020083425 A1    June 27, 2002              018     G06F009/44

INT-CL (IPC): E 2./4A

ABSTRACTED-PUB-NO: US20020083425A
BASIC-ABSTRACT:

NOVELTY - The maximum number of registers requested from several
register requests is determined. Each register request is modified
to request the maximum number of registers plus a selected number of
scratch registers.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for the 
following:

(1) Computer system; and 

(2) Computer readable medium. 

USE - For instrumenting binaries for hardware architecture in
computer (claimed).

ADVANTAGE - A linear search for maximum register allocation is
performed on a procedure-by-procedure basis, and the linear
replacement is performed to provide scratch registers effectively
without requiring the involved analysis. The scratch registers are
provided using the same index throughout the procedure, thus
simplifying the insertion of instrumenting code.
TITLE: Method and computer program product for adaptive inlining in a computer
system 

DATE-ISSUED: February 27, 2001 



INVENTOR-INFORMATION:
NAME                   CITY        STATE   ZIP CODE
Schmidt; William Jon   Rochester

US-CL-CURRENT: 717/1 SI; 717/157
ABSTRACT:

A method and computer program product are provided for implementing adaptive 
inlining in a computer system. Call sites in a call multigraph are identified for 
possible inlining. A first approximation of initial call sites of the identified 
possible call sites are identified for inlining. Procedures in the call multigraph 
are processed in a determined order where a first procedure is only processed after 
all second procedures called by the first procedure are processed. The processing of 
the first procedure comprises the steps of determining whether any call site within 
the first procedure has been selected for inlining, and whether the second procedure 
called from the call site contains confirmed inlined call sites. If true, it is 
determined whether to confirm or reject the first approximation to inline the second 
procedure into the first procedure at the call site utilizing at least one 
predetermined criterion. 

18 Claims, 6 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 6 
L13: Entry 2 of 7    File: USPT    Jan 16, 2001

US-PAT-NO: 6175956

DOCUMENT-IDENTIFIER: US 6175956 B1

TITLE: Method and computer program product for implementing method calls in a 
computer system 

DATE-ISSUED: January 16, 2001 
NAME CITY STATE ZIP CODE COUNTRY

Hicks; Daniel Rodman Byron MN 

Schmidt; William Jon Rochester MN 

US-CL-CURRENT: 717/114; 717/1R7; 717/1 5ft
ABSTRACT:

A computer implemented method and computer program compiler product are provided for 
implementing method calls in a computer system. Virtual method calls are identified 
in an intermediate instruction stream representation. Responsive to an identified 
virtual method call, profile data for the identified call site are read. A most 
frequently called procedure for the identified call site is compared with a first 
threshold value. Responsive to the most frequently called procedure being called 
less than the first threshold value, the virtual method call is maintained in a 
revised instruction stream representation. Responsive to the most frequently called 
procedure being called greater than or equal to the first threshold value, a guarded 
call to the most frequently called procedure is inserted at the identified call site 
in the revised instruction stream representation. In accordance with features of the 
invention, checking whether one object type accounts for more than a second 
threshold value of the calls to the most frequently called procedure at the 
identified call site is performed. Responsive to one object type accounting for more 
than or equal to the second threshold value, a type guard and a call to the most 
frequently called procedure are inserted at the identified call site in the revised 
instruction stream representation. Responsive to one object type accounting for less 
than the second threshold value, an address guard and a call to the most frequently 
called procedure are inserted at the identified call site in the revised instruction 
stream representation. 
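
The two-threshold decision in this abstract can be written out as a small sketch. It assumes both thresholds are fractions between 0 and 1 and that profile data arrive as plain dictionaries of call counts; those representations, the function name, and the sample counts are hypothetical.

```python
def choose_call_form(target_counts, type_counts, first_threshold, second_threshold):
    """Pick how to compile one virtual call site from its profile data.

    target_counts: {procedure: calls} observed at the call site.
    type_counts:   {procedure: {object_type: calls}} for the same site.
    Returns "virtual", "type_guard" or "address_guard".
    """
    total = sum(target_counts.values())
    if total == 0:
        return "virtual"
    target, hits = max(target_counts.items(), key=lambda item: item[1])
    if hits / total < first_threshold:
        return "virtual"                  # keep the virtual method call
    per_type = type_counts.get(target, {})
    if per_type and max(per_type.values()) / hits >= second_threshold:
        return "type_guard"               # one object type dominates
    return "address_guard"


# Hypothetical profile for one call site.
targets = {"A.f": 90, "B.f": 10}
one_type = choose_call_form(targets, {"A.f": {"A": 85, "A2": 5}}, 0.8, 0.9)
mixed = choose_call_form(targets, {"A.f": {"A": 50, "A2": 40}}, 0.8, 0.9)
cold = choose_call_form({"A.f": 50, "B.f": 50}, {}, 0.8, 0.9)
```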

13 Claims, 6 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 6 
3. Document ID: US 6072951 A

L13: Entry 3 of 7 File: USPT Jun 6, 2000

US-PAT-NO: 6072951

DOCUMENT-IDENTIFIER: US 6072951 A

TITLE: Profile driven optimization of frequently executed paths with inlining of 
code fragment (one or more lines of code from a child procedure to a parent 
procedure) 

DATE-ISSUED: June 6, 2000 
INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY 

Donovan; Robert John Rochester MN 

Roediger; Robert Ralph Rochester MN 

Schmidt; William Jon Rochester MN 

US-CL-CURRENT: 717/lSfl; 2U./15R
ABSTRACT:

A compiler and method of compiling provide enhanced performance by inlining one or 
more frequently executed paths through a child procedure into a parent procedure 
without inlining the entire child procedure. Accordingly, a substantial improvement 
in speed of execution of the program can be achieved by reducing procedure call 
overhead, with reduced expense in terms of program size compared to traditional
inlining. Various criteria for determining whether to inline particular child 
procedures are also described. 

26 Claims, 4 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 4 
4. Document ID: US 6029000 A

L13: Entry 4 of 7 File: USPT Feb 22, 2000

US-PAT-NO: 6029000

DOCUMENT-IDENTIFIER: US 6029000 A

TITLE: Mobile communication system with cross compiler and cross linker 
DATE-ISSUED: February 22, 2000 

INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY 

Woolsey; Matthew A. Plano TX

Lineberry; Marion C. Dallas TX 

Kim; Jihong Taegu KR 

US-CL-CURRENT: 717/147; 717/114, 717/162, 717/1 7^
ABSTRACT:

A wireless data platform (10) comprises a plurality of processors (12,16). Channels
of communication are set up between processors such that they may communicate 
information as tasks are performed. A dynamic cross compiler (80) executed on one 
processor compiles code into native processing code for another processor. A dynamic 
cross linker (82) links the compiled code for other processor. Native code may also 
be downloaded to the platform through use of a JAVA Bean (90) (or other language 
type) which encapsulates the native code. The JAVA Bean can be encrypted and 
digitally signed for security purposes. 

17 Claims, 6 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 3 






5. Document ID: US 6016398 A

L13: Entry 5 of 7 File: USPT Jan 18, 2000 
US-CL-CURRENT: 717/152
ABSTRACT:

The invention is a method of using static single assignment intermediate language to 
color out artificial register dependencies while compiling at least a portion of a 
computer program. The method comprises creating a rank-n SSA intermediate language 
representation of the computer program, wherein n is a positive integer greater than 
1; and coloring out the artificial register dependencies. 

22 Claims, 13 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 12 
DOCUMENT-IDENTIFIER: US 5937195 A

TITLE: Global control flow treatment of predicated code

DATE-ISSUED: August 10, 1999

INVENTOR-INFORMATION:

NAME CITY STATE ZIP CODE COUNTRY 

Ju; Dz-ching Sunnyvale CA 

Gillies; David M. Cupertino CA 

US-CL-CURRENT: 717/156; 712/201, 712/216, 717/1 S9
ABSTRACT:

The relationships among predicates are tracked globally by uniformly treating both 
control flow and explicit predicates by mapping them to a single connected partition 
graph. This allows for the analysis of predicate relations based on the scope of an 
entire procedure. This predicate analysis can be invoked by various phases of 
compiler optimization without being constrained by an incremental update of any 
persistent data structures. 

40 Claims, 8 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 3 
TITLE: System and method for recompiling computer programs for enhanced optimization
DATE-ISSUED: June 16, 1998 



INVENTOR-INFORMATION:
NAME                 CITY      STATE   ZIP CODE   COUNTRY
Gillies; David M.    Ontario                      CA

US-CL-CURRENT: 717/145; 717/lRfi, 717/159
ABSTRACT:

An optimizing compiler for producing executable programs from code, high level 
languages compiles the code whilst generating data from which a callgraph may be 
constructed, and then recompiles the procedures identified in the callgraph in an 
order which reverses the topology of the callgraph while monitoring usage of 
hardware registers. Procedures which are rarely or never called, or result in 
termination of the program, are identified, and are modified if needed so that if 
called, registers which they may modify are saved prior to execution of the 
procedure and subsequently restored if necessary, so that in a calling procedure, 
subsequently recompiled, no account need be taken of possible register usage by the 
called procedure. This makes additional registers available to the calling 
procedure, and enables register storing and restoring which must otherwise be 
associated with the callsite to be eliminated. 

16 Claims, 5 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 3 
1 An environment for research in microprogramming and emulation 80%
Robert F. Rosin, Gideon Frieder, Richard H. Eckhouse

Communications of the ACM August 1972 

Volume 15 Issue 8 

The development of the research project in microprogramming and emulation at State 
University of New York at Buffalo consisted of three phases: the evaluation of various 
possible machines to support this research; the decision to purchase one such machine, which 
appears to be superior to the others considered; and the organization and definition of goals 
for each group in the project. Each of these phases is reported, with emphasis placed on the 
early results achieved in this research. 



2 System architectures for computer music 80% 
John W. Gordon 

ACM Computing Surveys (CSUR) June 1985 

Volume 17 Issue 2 

Computer music is a relatively new field. While a large proportion of the public is aware of 
computer music in one form or another, there seems to be a need for a better understanding 
of its capabilities and limitations in terms of synthesis, performance, and recording hardware. 
This article addresses that need by surveying and discussing the architecture of existing 
computer music systems. System requirements vary according to what the system will be 
used for. Common uses for co ... 



3 Trap architectures for Lisp systems 77% 
Douglas Johnson 

Proceedings of the 1990 ACM conference on LISP and functional programming May 1990 

Recent measurements of Lisp systems show a dramatic skewing of operation frequency. For 
example, small integer (fix-num) arithmetic dominates most programs, but other number 
types can occur on almost any operation. Likewise, few memory references trigger special 
handling for garbage collection, but nearly all memory operations could trigger such special 
handling. Systems like SPARC and SPUR have shown that small amounts of special 
hardware can significantly reduce the need for inline softwa ... 



4 Construction of a transportable, multi-pass compiler for extended Pascal 77% 
G. J. Hansen, G. A. Shoults, J. D. Cointment 

Proceedings of the 1979 SIGPLAN symposium on Compiler construction August 1979 
This paper describes the implementation of an extended Pascal compiler on the TI 990 
minicomputer, the TI 980 minicomputer, and the IBM System 370. The compiler was 
designed to be as machine independent as possible; the parser and machine independent 
optimizer are parameterized so they are identical at the source code level for all machines. 
The paper describes the language modifications and extensions to Pascal, the parser, the 
optimizer, and the code generators. In addition, the intermedi ... 

5 Performance monitoring: METRIC: tracking down inefficiencies in the memory hierarchy via binary rewriting 77% 

Jaydeep Marathe, Frank Mueller, Tushar Mohan, Bronis R. de Supinski, Sally A. McKee, 
Andy Yoo 

In this paper, we present METRIC, an environment for determining memory inefficiencies by 
examining data traces. METRIC is designed to alter the performance behavior of applications 
that are mostly constrained by their latency to resolve memory references. We make four 
primary contributions in this paper. First, we present methods to extract partial data traces 
from running applications by observing their memory behavior via dynamic binary rewriting. 
Second, we present a methodology to represent ... 

6 Dynamo: a transparent dynamic optimization system 77% 
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia 

ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN 2000 conference on 
Programming language design and implementation May 2000 
Volume 35 Issue 5 

We describe the design and implementation of Dynamo, a software dynamic optimization 
system that is capable of transparently improving the performance of a native instruction 
stream as it executes on the processor. The input native instruction stream to Dynamo can be 
dynamically generated (by a JIT for example), or it can come from the execution of a 
statically compiled native binary. This paper evaluates the Dynamo system in the latter, more 
challenging situation, in order to emphasize the ... 

7 Direct execution models of processor behavior and performance 77% 
Richard M. Fujimoto, William B. Campbell 

Proceedings of the 19th conference on Winter simulation December 1987 
This paper discusses a modeling technique for creating efficient instruction level simulation 
models of von Neumann processors. In contrast to traditional approaches which use a 
software interpreter, this technique employs direct execution of application programs on the 
host computer. An assembly language program for the target machine is decompiled to a 
high level language, instrumented, and then recompiled and executed on the host computer. 
A prototype im ... 
8 Trap-driven memory simulation with Tapeworm II 77% 
Richard Uhlig, David Nagle, Trevor Mudge, Stuart Sechrest 

ACM Transactions on Modeling and Computer Simulation (TOMACS) January 1997 

Volume 7 Issue 1 



9 Shade: a fast instruction-set simulator for execution profiling 77% 
Bob Cmelik, David Keppel 

ACM SIGMETRICS Performance Evaluation Review, Proceedings of the 1994 ACM 
SIGMETRICS conference on Measurement and modeling of computer systems May 1994 

Volume 22 Issue 1 

Tracing tools are used widely to help analyze, design, and tune both hardware and software 
systems. This paper describes a tool called Shade which combines efficient instruction-set 
simulation with a flexible, extensible trace generation capability. Efficiency is achieved by 
dynamically compiling and caching code to simulate and trace the application program. The 
user may control the extent of tracing in a variety of ways; arbitrarily detailed application 
state information may be collected ... 

10 Fast instruction cache performance evaluation using compile-time analysis 77% 
David B. Whalley 

ACM SIGMETRICS Performance Evaluation Review, Proceedings of the 1992 ACM 
SIGMETRICS joint international conference on Measurement and modeling of 
computer systems June 1992 
Volume 20 Issue 1 
Fast Context Switches: Compiler and Architectural Support .. - Snyder, Whalley, Baker (1 citation) 

on a live range that has been allocated to a scratch register. 2 The scratch register can be 

this paper attempts to avoid saving and restoring registers by performing context switches at points in the 

www.cs.fsu.edu/~whalley/papers/mnm95.ps 

A New Fast Algorithm for Optimal Register Allocation in.. - Lelait, Gao, Eisenbeis (1998) (1 citation) 
En Automatique A New Fast Algorithm For Optimal Register Allocation In Modulo Scheduled Loops Sylvain 
ftp.inria.fr/INRIA/publication/RR/RR-3337.ps.gz 

Global Register Allocation Based on Graph Fusion - Guei-Yuan Lueh (1996) (7 citations) 
Global Register Allocation Based on Graph Fusion Guei-Yuan Lueh 
www.cs.cmu.edu/afs/cs.cmu.edu/project/iwarp/archive/1x-papers/lcpc96.ps 

Register Allocation over the Program Dependence Graph - Norris, Pollock (1994) (19 citations) 
Register Allocation over the Program Dependence Graph 
www.eecis.udel.edu/pub/people/pollock/rap.ps 

Code Reordering and Speculation Support for Dynamic.. - Nystrom, Barnes.. (2001) 
ordering of exceptions and the observed processor register contents at each exception point must be 
live out of branch I. Compilers may also register allocate across exception paths allowing registers to be 
the counter will contain the appropriate index into the table. The processor discards any 
www.crhc.uiuc.edu/IMPACT/ftp/conference/pact-01-speculation.ps 

Software-Directed Register Deallocation for.. - Lo, Parekh, Eggers, (2 citations) 
and Distributed Systems Software-Directed Register Deallocation for Simultaneous Multithreaded 
www-cse.ucsd.edu/users/tullsen/TPDS99.ps 

Dynamic Statistics of Sequential Prolog - Celis, Mills 

the architecture has 8 argument registers and no scratch registers. Hence, for each procedure calls that 
is done by updating the current choice pointer register (B) with the value of the previous choice 
and trust_me_else. Every time an environment is allocated, the flag and the current choice point are 
ftp.cs.indiana.edu/pub/techreports/TR390.ps.Z 

Register Relocation: Flexible Contexts for Multithreading - Waldspurger, Weihl (1993) (25 citations) 
Register Relocation: Flexible Contexts for 

www.research.digital.com/SRC/personal/Carl_Waldspurger/papers/register-isca93.ps 

Register Communication Strategies for the Multiscalar.. - Vijaykumar Scott (1996) (2 citations) 

1 Register Communication Strategies for the Multiscalar 

ftp.cs.wisc.edu/sohi/trs/register.1333.ps.gz 

Sub-element Indexing and Probabilistic Retrieval in the POSTGRES .. - Fontaine (1995) (1 citation) 
that user can also define their own functions and register them with the database. POSTGRES also supports a 
Sub-element Indexing and Probabilistic Retrieval in the POSTGRES 

throughout the collection. We developed an indexing mechanism based on sub-element indexing to 
wuarchive.wustl.edu/packages/postgres/papers/CSD-95-876.ps.Z 

Specification and verification of the Windows Card runtime.. - Gurevich et al. (1999) 

an RTE application never leaves its "sandbox" in scratch memory, never learns anything about the rest of 

linux.eecs.umich.edu/.5/groups/gasm/wincard.ps.gz 